Rebalancing Classical Music Task Overview

A. Task Overview

Figure 1. A simplified schematic of the baseline system.

  • A scene generator (blue box):
    • Selects the stereo music signal.
    • Gives the target gains (metadata) for the different instruments in the ensemble.
  • The music enhancement stage (pink box) takes the music as input and attempts to make a new mix with the target gains.
  • Listener characteristics (green oval) are audiograms and compressor settings. They allow personalised processing in the enhancement stage and are also used in the objective evaluation.
  • The enhancement outputs are evaluated (orange box) for audio quality using the Hearing-Aid Audio Quality Index (HAAQI).

Your challenge is to improve what happens in the pink music enhancement box. The rest of the baseline is fixed and should not be changed.

A.1 Original Mixture

  • The original mixture is an ensemble of 2 to 5 instruments drawn from:
    • Bassoon, Cello, Clarinet, Flute
    • Oboe, Sax, Viola, Violin
  • Four of these instruments can have a second voice in the same mixture:
    • Flute, Sax, Viola, Violin

B. Causality

We are interested in both causal and non-causal systems. Non-causal systems could be used for recorded music, whereas causal systems would also work for live listening. The allowed latency for causal systems will be 5 milliseconds, that is, systems cannot look beyond 5 ms into the future. For details about causality, refer to the Causality webpage.
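As a rough illustration of what the 5 ms limit means in samples, the sketch below converts the latency budget into a lookahead count and applies a toy smoother that never reads past it. The function names and the smoothing operation are hypothetical and are not part of the baseline; the sample rates in the usage note are only examples.

```python
import math


def max_lookahead_samples(sample_rate_hz: float, latency_ms: float = 5.0) -> int:
    """Largest whole number of future samples that fits in the latency budget."""
    return math.floor(sample_rate_hz * latency_ms / 1000.0)


def smooth_with_lookahead(signal, lookahead):
    """Toy processor: average each sample with at most `lookahead` future samples.

    Output sample n depends only on signal[n] .. signal[n + lookahead],
    so the processor respects the lookahead limit by construction.
    """
    out = []
    for n in range(len(signal)):
        # The slice stops at n + lookahead (inclusive) or at the end of the signal.
        window = signal[n : n + lookahead + 1]
        out.append(sum(window) / len(window))
    return out
```

For example, `max_lookahead_samples(44100)` gives 220 samples and `max_lookahead_samples(48000)` gives 240. Any processing that needs samples further ahead than this would count as non-causal under the rule above.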

C. Evaluation

Systems will be evaluated using the HAAQI [1] objective metric. HAAQI is an intrusive metric, meaning it needs a reference signal to compare against. The reference will be the mixture of the original isolated sources rebalanced using the target gains.
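A minimal sketch of how such a reference could be built is shown below. It assumes the target gains are given in dB per instrument and that each isolated source is a NumPy array of shape (samples, channels); both are assumptions for illustration, and the `remix` and `db_to_linear` names are hypothetical rather than taken from the baseline code.

```python
import numpy as np


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor (assumed dB convention)."""
    return 10.0 ** (gain_db / 20.0)


def remix(stems: dict[str, np.ndarray], gains_db: dict[str, float]) -> np.ndarray:
    """Scale each isolated source by its target gain and sum into one mixture."""
    if set(stems) != set(gains_db):
        raise ValueError("stems and gains must name the same instruments")
    # Start from silence with the same shape as the first source.
    mix = np.zeros_like(next(iter(stems.values())), dtype=float)
    for name, audio in stems.items():
        mix += db_to_linear(gains_db[name]) * audio
    return mix
```

With two stereo sources, `remix({"violin": v, "cello": c}, {"violin": 0.0, "cello": -20.0})` keeps the violin unchanged and reduces the cello amplitude by a factor of ten before summing.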

References

[1] Kates, J. M., & Arehart, K. H. (2015). The Hearing-Aid Audio Quality Index (HAAQI). IEEE/ACM Transactions on Audio, Speech, and Language Processing, 24(2), 354-365.